Abstract

This chapter presents a detailed discussion of the effect of non-response on the
estimator of the population mean in a frequently used design, namely stratified random
sampling. Our aim is to discuss the existing allocation schemes in the presence of
non-response and to suggest some new allocation schemes that utilize the knowledge of
the response and non-response rates of the different strata. The effects of the proposed
schemes on the sampling variance of the estimator are discussed and compared with the
usual allocation schemes, namely proportional allocation and Neyman allocation, in the
presence of non-response. An empirical study has also been carried out in support of
the results.

Keywords: Stratified random sampling, Allocation schemes, Non-response, Mean
squares, Empirical study.


1. Introduction 

Sukhatme (1935) has shown that, when optimum allocation is used in stratified sampling,
estimates of the strata variances obtained from a previous survey or from a specially
planned pilot survey, even one based on samples of moderate size, are adequate for
increasing the precision of the estimator. Evans (1951) also considered the problem of
allocation based on estimates of strata variances obtained in an earlier survey. The
sampling literature records various efforts to reduce the sampling error, that is, the
error that arises because only a part of the population is observed. Besides the sampling
error, there are several non-sampling errors, which arise from factors such as faulty
methods of selection and estimation, incomplete coverage, differences between
interviewers, and lack of proper supervision. Incompleteness or non-response in the form
of absence, censoring or grouping is a troubling issue in many data sets.

In choosing the sample sizes for the different strata in stratified random sampling, one
can make them either proportional to the strata sizes alone (proportional allocation) or
proportional to the strata sizes together with the variation within the strata (Neyman
allocation). If non-response is inherent in the entire population, and hence in all the
strata, it is practically impossible to adopt Neyman allocation because the stratum
variabilities will not be known. The response rates of the different strata, on the other
hand, may be readily available or may be easily estimated from the sample selected from
each stratum. It is therefore quite reasonable, in the presence of non-response, to
utilize the response rate (or non-response rate) when allocating the sample to the strata
instead of relying on Neyman allocation.


In the present chapter, we propose some new allocation schemes for selecting the samples
from the different strata, based on the response (non-response) rates of the strata in
the presence of non-response. We compare them with Neyman and proportional allocations.
The results are illustrated with a numerical example.


2. Sampling Strategy and Estimation Procedure 


In the study of non-response, according to one deterministic response model, it is
generally assumed that the population is dichotomized into two strata: a response stratum
consisting of all units for which measurements would be obtained if the units happened
to fall in the sample, and a non-response stratum of units for which no measurement
would be obtained. This division into two strata is, of course, an oversimplification of
the problem. The theory underlying the Hansen-Hurwitz (HH) technique is as given below.


Let a sample of size $n$ be drawn from a finite population of size $N$. Let $n_1$ units
in the sample respond and $n_2$ units not respond, so that $n_1 + n_2 = n$. The $n_1$
units may be regarded as a sample from the response class and the $n_2$ units as a
sample from the non-response class of the population. Let $N_1$ and $N_2$ be the numbers
of units in the response stratum and the non-response stratum, respectively, in the
population. Obviously, $N_1$ and $N_2$ are not known, but their unbiased estimates can be
obtained from the sample as

$$\hat{N}_1 = \frac{n_1 N}{n}; \qquad \hat{N}_2 = \frac{n_2 N}{n}.$$


Let $m$ be the size of the sub-sample drawn from the $n_2$ non-respondents to be
interviewed. Hansen and Hurwitz (1946) proposed an estimator of the population mean
$\bar{X}_0$ of the study variable $X_0$, namely

$$T_{0HH} = \frac{n_1 \bar{x}_{01} + n_2 \bar{x}_{0m}}{n}, \tag{2.1}$$

which is unbiased for $\bar{X}_0$, where $\bar{x}_{01}$ and $\bar{x}_{0m}$ are the sample
means of the study variable $X_0$ based on samples of sizes $n_1$ and $m$, respectively.
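
The estimator (2.1) can be sketched in a few lines of Python. This is a minimal
illustration rather than part of the original development: the function name
`hansen_hurwitz_mean` and the toy figures are assumptions chosen for the example, and
every unit in the sub-sample of non-respondents is assumed to have been interviewed.

```python
def hansen_hurwitz_mean(respondents, subsample, n):
    """Hansen-Hurwitz estimate (2.1) of the population mean.

    respondents : values observed on the n1 units that responded
    subsample   : values observed on the m sub-sampled non-respondents
    n           : size of the original sample (n = n1 + n2)
    """
    n1 = len(respondents)
    n2 = n - n1                       # number of non-respondents in the sample
    mean_resp = sum(respondents) / n1
    mean_sub = sum(subsample) / len(subsample)
    # Weight each mean by the size of the class it represents in the sample.
    return (n1 * mean_resp + n2 * mean_sub) / n


# Hypothetical toy data: n = 5, three respondents, and one of the two
# non-respondents is followed up (m = 1, so L = n2 / m = 2).
estimate = hansen_hurwitz_mean([10.0, 12.0, 14.0], [20.0], n=5)
print(estimate)
```

Note that the sub-sample mean is weighted by $n_2$, not by $m$: the $m$ interviewed units
stand in for all $n_2$ non-respondents in the sample.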


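The variance of $T_{0HH}$ is given by

$$V(T_{0HH}) = \left(\frac{1}{n} - \frac{1}{N}\right) S_0^2 + \frac{L - 1}{n} W_2 S_{02}^2, \tag{2.2}$$

where $L = \frac{n_2}{m}$, $W_2 = \frac{N_2}{N}$, and $S_0^2$ and $S_{02}^2$ are the mean
squares of the entire group and of the non-response group, respectively, in the
population.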


Let us consider a population of $N$ units divided into $k$ strata. Let the size of the
$i$-th stratum be $N_i$ $(i = 1, 2, \dots, k)$, and suppose we decide to select a sample of
size $n$ from the entire population in such a way that $n_i$ units are selected from the
$i$-th stratum. Thus, we have $\sum_{i=1}^{k} n_i = n$.

Let non-response occur in each stratum. Then, using the Hansen and Hurwitz procedure, we
select a sub-sample of $m_i$ units out of the $n_{i2}$ non-respondent units in the $i$-th
stratum by simple random sampling without replacement (SRSWOR), such that
$n_{i2} = L_i m_i$, $L_i \geq 1$, and the information is collected on all the $m_i$ units
by the interview method.

The Hansen-Hurwitz estimator of the population mean $\bar{X}_{0i}$ of the $i$-th stratum
is

$$T^*_{0i} = \frac{n_{i1} \bar{x}_{0i1} + n_{i2} \bar{x}_{0mi}}{n_i} \quad (i = 1, 2, \dots, k), \tag{2.3}$$

where $\bar{x}_{0i1}$ and $\bar{x}_{0mi}$ are the sample means based on the $n_{i1}$
respondent units and the $m_i$ sub-sampled non-respondent units, respectively, in the
$i$-th stratum.


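Obviously, $T^*_{0i}$ is an unbiased estimator of $\bar{X}_{0i}$. Combining the estimators
over all strata, we get the estimator of the population mean $\bar{X}_0$, given by

$$T^*_{0st} = \sum_{i=1}^{k} p_i T^*_{0i}, \tag{2.4}$$

where $p_i = N_i / N$. Obviously, we have

$$E\left[T^*_{0st}\right] = \bar{X}_0. \tag{2.5}$$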


The variance of $T^*_{0st}$ is given by

$$V\left[T^*_{0st}\right] = \sum_{i=1}^{k} \left(\frac{1}{n_i} - \frac{1}{N_i}\right) p_i^2 S_{0i}^2 + \sum_{i=1}^{k} \frac{(L_i - 1)}{n_i} W_{i2}\, p_i^2 S_{0i2}^2, \tag{2.6}$$

where $W_{i2} = \frac{N_{i2}}{N_i}$, and $S_{0i}^2$ and $S_{0i2}^2$ are the mean squares of
the entire group and of the non-response group, respectively, in the $i$-th stratum.
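
As an illustration of how (2.6) is evaluated for a given allocation, the following Python
sketch accumulates the variance stratum by stratum. The function name
`var_stratified_hh` and its argument names are hypothetical; each argument is a list with
one entry per stratum, and the non-response rates are taken as proportions.

```python
def var_stratified_hh(n_i, N_i, S2, S2_nr, W2, L):
    """Variance (2.6) of the stratified Hansen-Hurwitz estimator.

    n_i   : sample sizes allocated to the strata
    N_i   : stratum sizes
    S2    : mean squares S_0i^2 of the whole strata
    S2_nr : mean squares S_0i2^2 of the non-response groups
    W2    : non-response rates W_i2 (as proportions)
    L     : inverse sub-sampling fractions L_i = n_i2 / m_i
    """
    N = sum(N_i)
    variance = 0.0
    for n, Nh, s2, s2_nr, w2, l in zip(n_i, N_i, S2, S2_nr, W2, L):
        p = Nh / N                                        # stratum weight p_i
        variance += (1.0 / n - 1.0 / Nh) * p ** 2 * s2    # sampling part
        variance += (l - 1.0) / n * w2 * p ** 2 * s2_nr   # non-response part
    return variance


# Hypothetical single-stratum check: with k = 1 the result agrees with (2.2).
v = var_stratified_hh([10], [100], [4.0], [2.0], [0.5], [2.0])
print(v)
```

Keeping the two parts on separate lines makes it easy to see that the second part
vanishes when $L_i = 1$, that is, when every non-respondent is followed up.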


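It is easy to see that under 'proportional allocation' (PA), that is, when $n_i = n p_i$
for $i = 1, 2, \dots, k$, $V\left[T^*_{0st}\right]$ is obtained as

$$V\left[T^*_{0st}\right]_{PA} = \sum_{i=1}^{k} \left(\frac{1}{n} - \frac{1}{N}\right) p_i S_{0i}^2 + \frac{1}{n} \sum_{i=1}^{k} (L_i - 1) W_{i2}\, p_i S_{0i2}^2, \tag{2.7}$$

whereas under 'Neyman allocation' (NA), with
$n_i = \dfrac{n p_i S_{0i}}{\sum_{i=1}^{k} p_i S_{0i}}$ $(i = 1, 2, \dots, k)$, it is equal to

$$V\left[T^*_{0st}\right]_{NA} = \frac{1}{n} \left[\sum_{i=1}^{k} p_i S_{0i}\right]^2 - \frac{1}{N} \sum_{i=1}^{k} p_i S_{0i}^2 + \frac{1}{n} \left\{\sum_{i=1}^{k} (L_i - 1) W_{i2}\, p_i \frac{S_{0i2}^2}{S_{0i}}\right\} \left[\sum_{i=1}^{k} p_i S_{0i}\right]. \tag{2.8}$$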


It is important to mention here that the last terms in expressions (2.7) and (2.8) arise
due to non-response in the population. Further, in the presence of non-response, Neyman
allocation may or may not be more efficient than proportional allocation, a situation
quite contrary to the usual case when the population is free from non-response. This can
be understood from the following. We have

$$V\left[T^*_{0st}\right]_{PA} - V\left[T^*_{0st}\right]_{NA} = \frac{1}{n} \sum_{i=1}^{k} p_i \left(S_{0i} - S_w\right)^2 + \frac{1}{n} \sum_{i=1}^{k} (L_i - 1) W_{i2}\, p_i S_{0i2}^2 \left(1 - \frac{S_w}{S_{0i}}\right), \tag{2.9}$$

where $S_w = \sum_{i=1}^{k} p_i S_{0i}$.

While the first term in the above expression is necessarily non-negative, the second term
may be negative and greater than the first term in magnitude, depending upon the sign and
magnitude of the factor $\left(1 - \frac{S_w}{S_{0i}}\right)$ for all $i$. Thus, in the
presence of non-response in the stratified population, Neyman allocation does not always
guarantee a better result, as it does when the population is free from non-response
error.

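
A small numerical sketch may help to see how the difference in (2.9) can turn negative.
The two-stratum figures below are hypothetical and chosen only to make the effect
visible: the stratum with the smaller $S_{0i}$ carries heavy non-response and a large
$L_i$. The code evaluates (2.7) and (2.8) directly and compares their difference with the
right-hand side of (2.9).

```python
from math import isclose

# Hypothetical two-stratum population (not taken from the chapter's data).
n, N = 20, 1000
p = [0.5, 0.5]            # stratum weights p_i
S = [1.0, 3.0]            # S_0i
S2_nr = [1.0, 9.0]        # S_0i2^2
W2 = [0.8, 0.1]           # non-response rates W_i2
L = [4.0, 1.0]            # L_i

k = range(len(p))
Sw = sum(p[i] * S[i] for i in k)                           # S_w
sum_pS2 = sum(p[i] * S[i] ** 2 for i in k)
nr = [(L[i] - 1) * W2[i] * p[i] * S2_nr[i] for i in k]     # (L_i - 1) W_i2 p_i S_0i2^2

# Expression (2.7): proportional allocation.
v_pa = (1 / n - 1 / N) * sum_pS2 + sum(nr) / n
# Expression (2.8): Neyman allocation.
v_na = Sw ** 2 / n - sum_pS2 / N + Sw * sum(nr[i] / S[i] for i in k) / n
# Right-hand side of (2.9).
rhs = (sum(p[i] * (S[i] - Sw) ** 2 for i in k)
       + sum(nr[i] * (1 - Sw / S[i]) for i in k)) / n

print(v_pa - v_na, rhs)   # both negative here: NA is worse than PA
```

In this configuration the first stratum has $S_{01} < S_w$, so its factor
$\left(1 - S_w / S_{01}\right)$ is negative and outweighs the first term of (2.9).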
3. Some New Allocation Schemes 


It is a well-known fact that when the stratified population is free from non-response
error and the strata mean squares $S_{0i}^2$ $(i = 1, 2, \dots, k)$ are known, Neyman
allocation is always preferable to proportional allocation for increasing the precision
of the estimator. If the population is affected by non-response, however, Neyman
allocation is not always the better proposition, as highlighted in Section 2 above.
Moreover, when non-response is present in the strata, the strata mean squares $S_{0i}^2$
are impossible to obtain; rather, direct estimates of $S_{0i1}^2$ and $S_{0i2}^2$ may be
had from the sample. Under these circumstances, it is practically difficult to adopt
Neyman allocation if non-response is inherent in the population. Proportional allocation,
however, does not demand the knowledge of the strata mean squares and rests only upon the
strata sizes; hence it is well applicable even in the presence of non-response.


As discussed in Section 2, unbiased estimates of the response and non-response rates in
the population are readily available, and hence it seems quite reasonable to develop
allocation schemes that involve the knowledge of the population response (non-response)
rates in each stratum. If such allocation schemes yield more precise estimates than
proportional allocation, they would be advisable to adopt instead of Neyman allocation
for the reasons mentioned above.


In this section, we therefore propose some new allocation schemes which utilize the
knowledge of the response (non-response) rates in the subpopulations. While some of the
proposed schemes do not utilize the knowledge of $S_{0i}$, some others are based on the
knowledge of $S_{0i}$ just in order to compare them with Neyman allocation in the
presence of non-response. In addition to the assumptions of proportional and Neyman
allocations, we further assume that it is logical to allocate a larger sample to a
stratum having a larger number of respondents, and vice versa, when proposing the new
allocation schemes.

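Scheme-1 [OA (1)]:

Let us assume that a larger sample is selected from a larger stratum and from a stratum
with a larger response rate, that is,

$$n_i \propto p_i W_{i1} \quad \text{for } i = 1, 2, \dots, k,$$

where $W_{i1} = 1 - W_{i2}$ is the response rate of the $i$-th stratum. Then we have
$n_i = K p_i W_{i1}$, where $K$ is a constant. Since $\sum_{i=1}^{k} n_i = n$, the value of
$K$ will be

$$K = \frac{n}{\sum_{i=1}^{k} p_i W_{i1}}, \quad \text{so that} \quad n_i = \frac{n p_i W_{i1}}{\sum_{i=1}^{k} p_i W_{i1}}. \tag{3.1}$$

Putting this value of $n_i$ in expression (2.6), we get

$$V\left[T^*_{0st}\right]_1 = \frac{1}{n} \left[\sum_{i=1}^{k} p_i W_{i1}\right] \left[\sum_{i=1}^{k} \frac{p_i S_{0i}^2}{W_{i1}} + \sum_{i=1}^{k} \frac{(L_i - 1) W_{i2}\, p_i S_{0i2}^2}{W_{i1}}\right] - \frac{1}{N} \sum_{i=1}^{k} p_i S_{0i}^2. \tag{3.2}$$
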
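Scheme-2 [OA (2)]:

Let us assume that $n_i \propto p_i W_{i1} S_{0i}$. Then, we have

$$n_i = \frac{n p_i W_{i1} S_{0i}}{\sum_{i=1}^{k} p_i W_{i1} S_{0i}}, \tag{3.3}$$

and hence expression (2.6) becomes

$$V\left[T^*_{0st}\right]_2 = \frac{1}{n} \left[\sum_{i=1}^{k} p_i W_{i1} S_{0i}\right] \left[\sum_{i=1}^{k} \frac{p_i S_{0i}}{W_{i1}} + \sum_{i=1}^{k} \frac{(L_i - 1) W_{i2}\, p_i S_{0i2}^2}{W_{i1} S_{0i}}\right] - \frac{1}{N} \sum_{i=1}^{k} p_i S_{0i}^2. \tag{3.4}$$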


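Scheme-3 [OA (3)]:

Let us select a larger sample from a larger stratum but a smaller sample if the
non-response rate is high. That is,

$$n_i \propto \frac{p_i}{W_{i2}}.$$

Then

$$n_i = \frac{n\, p_i / W_{i2}}{\sum_{i=1}^{k} p_i / W_{i2}}, \tag{3.5}$$

and the expression of $V\left[T^*_{0st}\right]$ reduces to

$$V\left[T^*_{0st}\right]_3 = \frac{1}{n} \left[\sum_{i=1}^{k} \frac{p_i}{W_{i2}}\right] \sum_{i=1}^{k} \left[p_i W_{i2} S_{0i}^2 + (L_i - 1) W_{i2}^2\, p_i S_{0i2}^2\right] - \frac{1}{N} \sum_{i=1}^{k} p_i S_{0i}^2. \tag{3.6}$$
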
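Scheme-4 [OA (4)]:

Let $n_i \propto \dfrac{p_i S_{0i}}{W_{i2}}$, so that

$$n_i = \frac{n\, p_i S_{0i} / W_{i2}}{\sum_{i=1}^{k} p_i S_{0i} / W_{i2}}. \tag{3.7}$$

The corresponding expression of $V\left[T^*_{0st}\right]$ is

$$V\left[T^*_{0st}\right]_4 = \frac{1}{n} \left[\sum_{i=1}^{k} \frac{p_i S_{0i}}{W_{i2}}\right] \sum_{i=1}^{k} \left[p_i W_{i2} S_{0i} + (L_i - 1) W_{i2}^2\, p_i \frac{S_{0i2}^2}{S_{0i}}\right] - \frac{1}{N} \sum_{i=1}^{k} p_i S_{0i}^2. \tag{3.8}$$
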
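Scheme-5 [OA (5)]:

Let $n_i \propto \dfrac{p_i W_{i1}}{W_{i2}}$; then

$$n_i = \frac{n\, p_i W_{i1} / W_{i2}}{\sum_{i=1}^{k} p_i W_{i1} / W_{i2}}. \tag{3.9}$$

The expression (2.6) gives

$$V\left[T^*_{0st}\right]_5 = \frac{1}{n} \left[\sum_{i=1}^{k} \frac{p_i W_{i1}}{W_{i2}}\right] \sum_{i=1}^{k} \left[\frac{p_i W_{i2} S_{0i}^2}{W_{i1}} + \frac{(L_i - 1) W_{i2}^2\, p_i S_{0i2}^2}{W_{i1}}\right] - \frac{1}{N} \sum_{i=1}^{k} p_i S_{0i}^2. \tag{3.10}$$
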
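Scheme-6 [OA (6)]:

If $n_i \propto \dfrac{p_i W_{i1} S_{0i}}{W_{i2}}$, then we have

$$n_i = \frac{n\, p_i W_{i1} S_{0i} / W_{i2}}{\sum_{i=1}^{k} p_i W_{i1} S_{0i} / W_{i2}}. \tag{3.11}$$

In this case, $V\left[T^*_{0st}\right]$ becomes

$$V\left[T^*_{0st}\right]_6 = \frac{1}{n} \left[\sum_{i=1}^{k} \frac{p_i W_{i1} S_{0i}}{W_{i2}}\right] \sum_{i=1}^{k} \left[\frac{p_i W_{i2} S_{0i}}{W_{i1}} + \frac{(L_i - 1) W_{i2}^2\, p_i S_{0i2}^2}{W_{i1} S_{0i}}\right] - \frac{1}{N} \sum_{i=1}^{k} p_i S_{0i}^2. \tag{3.12}$$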


Remark 1: If the response rate assumes the same value in all the strata, that is,
$W_{i1} = W$ (say), then schemes 1, 3 and 5 reduce to 'proportional allocation', while
schemes 2, 4 and 6 reduce to 'Neyman allocation'. The corresponding expressions
$V\left[T^*_{0st}\right]_r$ $(r = 1, 3, 5)$ are then similar to
$V\left[T^*_{0st}\right]_{PA}$, and $V\left[T^*_{0st}\right]_r$ $(r = 2, 4, 6)$ reduce to
$V\left[T^*_{0st}\right]_{NA}$.
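
The six allocation rules (3.1), (3.3), (3.5), (3.7), (3.9) and (3.11) differ only in the
quantity to which $n_i$ is made proportional, so they can be sketched with one helper.
The function name `allocate` and the dictionary of weight rules are assumptions made for
this illustration, and the returned sample sizes are left unrounded. The usage lines
check Remark 1 on hypothetical figures with a common response rate.

```python
def allocate(scheme, n, p, W2, S):
    """Unrounded sample sizes n_i under scheme OA(1), ..., OA(6).

    p  : stratum weights p_i
    W2 : non-response rates W_i2 (proportions); W_i1 = 1 - W_i2
    S  : square roots S_0i of the stratum mean squares
    """
    rules = {
        1: lambda pi, w1, w2, s: pi * w1,            # (3.1)
        2: lambda pi, w1, w2, s: pi * w1 * s,        # (3.3)
        3: lambda pi, w1, w2, s: pi / w2,            # (3.5)
        4: lambda pi, w1, w2, s: pi * s / w2,        # (3.7)
        5: lambda pi, w1, w2, s: pi * w1 / w2,       # (3.9)
        6: lambda pi, w1, w2, s: pi * w1 * s / w2,   # (3.11)
    }
    weights = [rules[scheme](pi, 1.0 - w2, w2, s) for pi, w2, s in zip(p, W2, S)]
    total = sum(weights)
    return [n * w / total for w in weights]


# Hypothetical figures with the same response rate in every stratum.
p, S, W2 = [0.2, 0.3, 0.5], [4.0, 2.0, 1.0], [0.3, 0.3, 0.3]
pa = [60 * pi for pi in p]                       # proportional allocation
ps = [pi * s for pi, s in zip(p, S)]
na = [60 * x / sum(ps) for x in ps]              # Neyman allocation
odd_match = all(abs(a - b) < 1e-9 for r in (1, 3, 5)
                for a, b in zip(allocate(r, 60, p, W2, S), pa))
even_match = all(abs(a - b) < 1e-9 for r in (2, 4, 6)
                 for a, b in zip(allocate(r, 60, p, W2, S), na))
print(odd_match, even_match)
```

When the rates are equal, the factors involving $W_{i1}$ and $W_{i2}$ cancel in the
normalisation, which is exactly the content of Remark 1.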


Remark 2: A theoretical comparison of $V\left[T^*_{0st}\right]_r$ $(r = 1, 3, 5)$ and
$V\left[T^*_{0st}\right]_r$ $(r = 2, 4, 6)$ with $V\left[T^*_{0st}\right]_{PA}$ and
$V\left[T^*_{0st}\right]_{NA}$, respectively, is required in order to understand the
suitability of the proposed schemes, but such comparisons do not yield explicit
solutions in general. The suitability of a scheme depends upon the parametric values of
the population. We have, therefore, illustrated the results with the help of some
empirical data.

4. Empirical Study


In order to investigate the efficiency of the estimator $T^*_{0st}$ under the proposed
allocation schemes, which are based on response (non-response) rates, we consider here an
empirical data set.


We have taken the data given in Appendix B of Särndal et al. (1992). The data refer to
284 municipalities in Sweden, varying considerably in size and other characteristics.
The population consisting of the 284 municipalities is referred to as the MU284
population.

For the purpose of illustration, we have randomly divided the 284 municipalities into
four strata consisting of 73, 70, 97 and 44 municipalities. The 1985 population (in
thousands) has been considered as the study variable, $X_0$.


On the basis of the data, the following values of the parameters were obtained:

Table 1: Particulars of the Data ($N = 284$)

Stratum   Stratum Size   Stratum Mean       Stratum Mean Square   Mean Square of the Non-response Group
($i$)     ($N_i$)        ($\bar{X}_{0i}$)   ($S_{0i}^2$)          ($S_{0i2}^2 = \frac{4}{5} S_{0i}^2$)
1         73             40.85              6369.10               5095.28
2         70             27.83              1051.07               840.86
3         97             25.78              2014.97               1611.97
4         44             20.64              538.47                430.78


We have taken the sample size $n = 60$.

Table 2 shows the sample sizes $n_i$ $(i = 1, 2, 3, 4)$ and the values of
$V\left[T^*_{0st}\right]$ under PA, NA and the proposed schemes OA(1) to OA(6) for
different choices of the values of $L_i$ and $W_{i2}$ $(i = 1, 2, 3, 4)$.

($L_i = 2.0, 2.5, 1.5, 3.5$ for $i = 1, 2, 3, 4$, respectively)


Table 2: Sample Sizes and Variance of $T^*_{0st}$ under Different Allocation Schemes

Stratum   Non-response rate    Sample size ($n_i$) and $V[T^*_{0st}]$ under
($i$)     $W_{i2}$ (percent)   PA      NA      OA(1)   OA(2)   OA(3)   OA(4)   OA(5)   OA(6)

1         20                   15      26      17      28      20      31      22      33
2         25                   15      10      15      10      15      10      15      10
3         30                   21      19      20      18      18      16      17      14
4         35                    9       5       8       4       7       3       6       3
$V[T^*_{0st}]$                 43.08   36.04   41.02   116.59  38.43   38.43   37.85   40.25

1         35                   15      26      14      24      12      21      10      19
2         30                   15      10      14      10      13      10      13      10
3         25                   21      19      21      21      22      22      23      24
4         20                    9       5      11       5      13       7      14       7
$V[T^*_{0st}]$                 45.97   37.27   49.17   117.37  55.41   39.07   60.76   40.72

1         25                   15      26      16      27      16      27      16      27
2         20                   15      10      16      11      19      13      21      14
3         30                   21      19      20      18      18      17      17      16
4         35                    9       5       8       4       7       3       6       3
$V[T^*_{0st}]$                 43.91   36.30   43.40   116.54  44.15   37.76   44.69   38.94

1         20                   15      26      17      28      20      32      22      34
2         25                   15      10      15      10      16      10      16      10
3         35                   21      19      19      17      16      14      14      12
4         30                    9       5       9       5       8       4       8       4
$V[T^*_{0st}]$                 43.17   35.99   41.32   115.40  39.45   38.82   39.73   41.30
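
The PA and NA variances in Table 2 can be checked directly from Table 1. The sketch below
evaluates (2.7) and (2.8) for the first combination of non-response rates ($W_{i2}$ = 20,
25, 30 and 35 percent) with $L_i$ = 2.0, 2.5, 1.5, 3.5 and $n = 60$; the variable names
are assumptions of this illustration. With these inputs the two variances should come
out close to the reported 43.08 and 36.04.

```python
from math import sqrt

N_i = [73, 70, 97, 44]                        # stratum sizes (Table 1)
S2 = [6369.10, 1051.07, 2014.97, 538.47]      # S_0i^2
S2_nr = [5095.28, 840.86, 1611.97, 430.78]    # S_0i2^2
W2 = [0.20, 0.25, 0.30, 0.35]                 # W_i2, first combination in Table 2
L = [2.0, 2.5, 1.5, 3.5]                      # L_i
n, N = 60, sum(N_i)

p = [Nh / N for Nh in N_i]                    # stratum weights p_i
S = [sqrt(s2) for s2 in S2]                   # S_0i
sum_pS2 = sum(pi * s2 for pi, s2 in zip(p, S2))
sum_pS = sum(pi * s for pi, s in zip(p, S))
# Non-response contributions (L_i - 1) W_i2 p_i S_0i2^2, one per stratum.
nr = [(l - 1) * w * pi * s2n for l, w, pi, s2n in zip(L, W2, p, S2_nr)]

v_pa = (1 / n - 1 / N) * sum_pS2 + sum(nr) / n                    # (2.7)
v_na = (sum_pS ** 2 / n - sum_pS2 / N
        + sum_pS * sum(t / s for t, s in zip(nr, S)) / n)         # (2.8)

print(round(v_pa, 2), round(v_na, 2))
```

The closed forms use unrounded sample sizes, so the variances do not depend on how the
$n_i$ are rounded to integers.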


5. Concluding Remarks 


In the present chapter, our aim was to accommodate the non-response error inherent in a
stratified population during the estimation procedure and hence to suggest some new
allocation schemes which utilize the knowledge of the response (non-response) rates of
the strata. As discussed in the preceding sections, Neyman allocation may sometimes
produce less precise estimates of the population mean than proportional allocation if
non-response is present in the population. Moreover, Neyman allocation is sometimes
impractical in such a situation, since the mean squares of the strata, $S_{0i}^2$
$(i = 1, 2, 3, 4)$, are then neither known nor easily estimated from the sample. In
contrast, what might be easily known or estimated from the sample are the response
(non-response) rates of the different strata. It was, therefore, thought fit to propose
some new allocation schemes depending upon the response (non-response) rates.


A look at Table 2 reveals that in most of the situations (under different combinations
of $W_{i2}$ and $L_i$), the allocation schemes OA (1), OA (3) and OA (5), which depend
solely upon the knowledge of $p_i$ and $W_{i1}$ (or $W_{i2}$), produce more precise
estimates than PA. Further, as far as a comparative study of schemes OA (1), OA (3) and
OA (5) is concerned, all these schemes are more or less similar in terms of their
efficiency. Thus, in addition to the knowledge of the strata sizes, $p_i$, the knowledge
of the response (non-response) rates, $W_{i1}$ (or $W_{i2}$), when allocating the sample
to the different strata adds to the precision of the estimate.

It is also evident from the table that the additional information on the mean squares of
the strata adds to the precision of the estimate, but this contribution is not very
significant in comparison to NA. Scheme OA (2) is worse than every other scheme
throughout.